Disclosure Risk Measurement with Entropy in Two-Dimensional Sample Based Frequency Tables

Authors

  • Laszlo Antal
  • Natalie Shlomo
  • Mark Elliot
Abstract

We extend a disclosure risk measure defined for population-based frequency tables to sample-based frequency tables. The measure is built from information-theoretic quantities, such as entropy and conditional entropy, that reflect the properties of attribute disclosure. Estimating the disclosure risk of a sample-based frequency table requires taking the underlying population into account, and therefore requires both the population and the sample frequencies. However, the population frequencies might not be known, in which case they must be estimated from the sample. We consider two probabilistic models for estimating the population frequencies: a log-linear model and a so-called Pólya urn model. Numerical results suggest that the Pólya urn model may be a feasible alternative to the log-linear model for estimating both the population frequencies and the disclosure risk measure.
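The abstract does not give the formula for the risk measure or the details of the urn scheme, so the following Python sketch only illustrates the generic ingredients it names: Shannon entropy, conditional entropy of a two-dimensional table, and a classical Pólya urn used to grow sample counts to a population size. The function names, the example table, and the population size of 100 are all assumptions made for illustration and should not be read as the authors' method.

```python
import math
import random


def entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by nonnegative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(column | row) in bits for a two-dimensional table given as a list of rows."""
    total = sum(sum(row) for row in table)
    # Weight each row's entropy by the share of the table that falls in that row.
    return sum((sum(row) / total) * entropy(row) for row in table if sum(row) > 0)


def polya_urn_extend(sample_counts, population_size, rng):
    """Grow sample cell counts to population_size with a classical Polya urn.

    Assumption: this is the textbook urn (draw a cell with probability
    proportional to its current count, then add one unit to it). Cells that
    are zero in the sample stay zero here; the paper's model may differ.
    """
    counts = list(sample_counts)
    while sum(counts) < population_size:
        cell = rng.choices(range(len(counts)), weights=counts)[0]
        counts[cell] += 1
    return counts


# Hypothetical 2 x 3 sample table; the first row has all its units in one cell.
sample = [[4, 0, 0], [1, 2, 3]]
flat = [c for row in sample for c in row]

print("table entropy:", entropy(flat))
print("conditional entropy H(col | row):", conditional_entropy(sample))
print("urn-extended counts:", polya_urn_extend(flat, 100, random.Random(0)))
```

In this example the first row contributes zero entropy because every unit in it shares the same column value, which is the kind of situation usually associated with attribute disclosure: knowing the row reveals the column.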


Similar resources

Measuring Disclosure Risk and Information Loss in Population Based Frequency Tables

Frequency tables disseminated by statistical agencies have always been of high interest. However, the agencies have to ensure that the risk of identifying individuals and disclosing individuals’ attributes from the released data is low. Therefore they assess the risk of disclosure and apply statistical disclosure control (SDC) methods if necessary. The main objective of this work is to measure ...


Risk measurement and Implied volatility under Minimal Entropy Martingale Measure for Levy process

This paper focuses on two main issues that are based on two important concepts: the exponential Lévy process and the minimal entropy martingale measure. First, we aim to obtain risk measures such as value-at-risk (VaR) and conditional value-at-risk (CVaR) using the Monte Carlo method under the minimal entropy martingale measure (MEMM) for an exponential Lévy process. This martingale measure is used for the...


A posteriori Disclosure Risk Measure for Tabular Data Based on Conditional Entropy

Statistical database protection, also known as Statistical Disclosure Control (SDC), is a part of information security that tries to prevent published statistical information (tables, individual records) from disclosing the contribution of specific respondents. This paper deals with the assessment of the disclosure risk associated with the release of tabular data. So-called sensitivity rules are...


Statistical Disclosure Control Methods for Census Frequency Tables

This paper provides a review of common statistical disclosure control (SDC) methods implemented at Statistical Agencies for standard tabular outputs containing whole population counts from a Census (either enumerated or based on a register). These methods include record swapping on the microdata prior to its tabulation and rounding of entries in the tables after they are produced. The approach ...


Cell Suppression to Limit Content-Based Disclosure

The increasing demand for information, coupled with the increasing capability of computer systems, has compelled information providers to reassess their procedures for preventing disclosure of confidential information. General logical and numerical methods exist to determine, prior to release, if disclosure can occur, either directly or through inference. One method uses linear programming techni...



Publication date: 2015